Sensor Modeling and Multi-Sensor Data Fusion


1.  Introduction 

The  desire  to  enhance  the  capabilities  of  modern  control  systems  has  led  to  the  development  of 
complex  systems  that  are  characterized  by  their  increased  non-linearity,  flexibility,  intelligence 
and  enhanced  ability  to  handle  uncertainty.  In  order  to  incorporate  autonomous  decision  making 
abilities,  these  systems  need  to  possess  complex  capabilities  such  as  perception,  knowledge 
acquisition,  learning,  adaptability,  and  reasoning.  Moreover,  the  system  should  be  able  to  draw 
inference from incomplete, ambiguous or approximate information, and deal with uncertain and
dynamic  situations.  The  revolutionary  advancement  in  the  field  of  sensor  technology  that  has  led 
to  development  of  superior  sensing  capabilities,  and  progress  in  computing  and  information 
processing  has  made  it  possible  to  develop  systems  with  autonomous  decision  making 
capabilities. 

Dynamic  systems  generally  employ  multiple  sensors  to  provide  diverse,  complementary  as 
well  as  redundant  information.  The  primary  goal  of  a  multi-sensor  system  is  to  combine 
information  from  a  multitude  of  sources  into  a  robust,  accurate  and  consistent  environment 
description.  There  are  several  issues  that  arise  while  solving  the  multiple  sensor  fusion  problem 
including  inherent  uncertainty  in  each  sensor’s  measurements,  and  diverse  and  temporally  or 
spatially  disparate  nature  of  measurements.  The  uncertainties  in  sensors  not  only  arise  from  the 
impreciseness  and  noise  in  the  measurements,  but  are  also  caused  by  the  ambiguities  and 
inconsistencies in the environment, and the inability to distinguish between them. The strategies
used to fuse data from these sensors should be able to eliminate such uncertainties, take into
account the environmental parameters that affect sensor measurements, and fuse information of
different natures to obtain a consistent description of the environment.

The  algorithms  reported  in  literature  to  fuse  data  from  multiple  sensory  sources  can  be 
classified  into  three  categories:  1)  Fusion  based  on  probabilistic  methods,  2)  Fusion  based  on 
least-squares  techniques,  and  3)  Fusion  based  on  intelligent  methods.  All  of  these  methods  differ 
in  the  manner  they  try  to  model  the  uncertainties  inherent  in  the  sensor  measurements.  The 
research  carried  out  in  this  project  combines  two  different  kinds  of  sensing  modalities  to  obtain 
three-dimensional  occupancy  profiles  of  a  robotic  workspace.  The  first  modality  is  two  vision 
sensors  mounted  on  a  stereo  rig,  and  the  second  one  is  an  Infra-Red  (IR)  proximity  sensor.  The 
research presented in this report first discusses a novel neural network based technique to obtain
probabilistic sensor models. The fusion is then carried out in an occupancy grid framework using a
Bayesian approach. Other techniques used in the literature for sensor fusion include
Dempster-Shafer theory for evidential reasoning [1-2], fuzzy logic [3-4], and statistical
techniques [5] such as the Kalman filter [6-8]. The following section briefly discusses the
Bayesian approach.

2. Bayesian approach for sensor fusion

Bayesian inference [9-11] is a statistical data fusion algorithm based on Bayes’ theorem [12]
of conditional or a posteriori probability. It estimates an n-dimensional state vector X after the
observation or measurement, denoted by Z, has been made. The probabilistic information
contained in Z about X is described by a probability density function (p.d.f.) p(Z | X), known as
the likelihood function or the sensor model, which is a sensor dependent objective function based
on observation. The likelihood function relates the extent to which the a posteriori probability is
subject to change, and is evaluated either via offline experiments or by utilizing the available
information about the problem. If information about the state X is available independently before
any observation is made, it can be combined with the likelihood function to provide more accurate
results. Such a priori information about X can be encapsulated as the prior probability P(X = x)
and is regarded as subjective because it is not based on observed data. Bayes’ theorem provides
the posterior conditional distribution of X = x, given Z = z, as


$$ p(X = x \mid Z = z) = \frac{p(Z = z \mid X = x)\, P(X = x)}{\int p(Z = z \mid X = x)\, P(X = x)\, dx} = \frac{p(Z = z \mid X = x)\, P(X = x)}{P(Z = z)} \tag{1} $$

Since the denominator depends only on the measurement (the integration is carried out over all
possible values of the state), an intuitive approach to the estimation is to maximize this
posterior distribution, i.e., to maximize the numerator of (1). This is called the Maximum a
posteriori (MAP) estimate, and is given by:


$$ \hat{x}_{MAP} = \arg\max_{x}\, p(X = x \mid Z = z), \qquad p(X = x \mid Z = z) \propto p(Z = z \mid X = x)\, P(X = x) \tag{2} $$

Another popular estimation scheme, the Minimum Mean Square Error (MMSE) estimator,
minimizes the expected squared error, i.e., the mean squared Euclidean distance between the true
state and the estimate, after the observation has been made.

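To incorporate the measurements from two sensors, and assuming that the two measurements are
conditionally independent given the state, (1) can be written as:

$$ p(X = x \mid Z = z_1, z_2) = \frac{p(Z = z_1 \mid X = x)\, p(Z = z_2 \mid X = x)\, P(X = x)}{P(Z = z_1, z_2)} \tag{3} $$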


Fig.  1  shows  a  process  in  which  the  sensory  data  made  available  from  multiple  sensors  can  be 
fused  under  the  Bayesian  scheme. 

Fig. 1. Multiple Sensor Fusion using Bayesian Technique

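The two-sensor update in (3) can be illustrated numerically. The following is a minimal sketch rather than the implementation used in this work: it assumes a one-dimensional state discretized on a grid, Gaussian sensor models, and a flat prior. The function names, the readings and the standard deviations are all hypothetical.

```python
import numpy as np

def gaussian_likelihood(z, x, sigma):
    """Sensor model p(z | x): Gaussian centred on the true state x."""
    return np.exp(-(z - x) ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))

def fuse_two_sensors(x_grid, prior, z1, sigma1, z2, sigma2):
    """Discrete posterior over x_grid following (3).

    Assumes z1 and z2 are conditionally independent given x.
    The sum over the grid stands in for the normalizing term P(Z = z1, z2).
    """
    unnormalized = (gaussian_likelihood(z1, x_grid, sigma1)
                    * gaussian_likelihood(z2, x_grid, sigma2)
                    * prior)
    return unnormalized / unnormalized.sum()

# Hypothetical example: candidate positions in mm and two readings.
x_grid = np.linspace(0.0, 100.0, 1001)
prior = np.full(x_grid.size, 1.0 / x_grid.size)   # flat prior (assumption)
posterior = fuse_two_sensors(x_grid, prior, z1=48.0, sigma1=2.0, z2=52.0, sigma2=4.0)
x_map = x_grid[np.argmax(posterior)]               # MAP estimate as in (2)
```

With a flat prior, the MAP estimate lands closer to the reading of the sensor with the smaller standard deviation.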
3.  Sensor  models 

Sensor modeling [13] deals with developing an understanding of the nature of the measurements
provided by the sensor, the limitations of the sensor, and a probabilistic understanding of the
sensor performance in terms of the uncertainties. The information supplied by a sensor is usually
modeled as a mean about a true value, with uncertainty due to noise represented by a variance
that depends on both the measured quantities themselves and the operational parameters of the
sensor. A probabilistic sensor model is particularly useful because it facilitates the
determination of the statistical characteristics of the data obtained. This probabilistic model is
usually expressed in the form of a probability density function (p.d.f.) p(z | x) that captures the
probability distribution of the measurement by the sensor (z) when the state of the measured
quantity (x) is known. This distribution is extremely sensor specific and can be experimentally
determined. The Gaussian distribution is one of the most commonly used distributions to represent
the sensor uncertainties and is given by the following equation:


$$ p(z \mid x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(z - x)^2}{2\sigma^2} \right) \tag{4} $$


The standard deviation σ of the distribution is a measure of the uncertainty of the data
provided by the sensors. Durrant-Whyte [14] has used the summation of two Gaussian distributions
to model uncertainty in the sensor measurement. Researchers have developed a few other
methods [15] to iteratively update the parameters of the distribution.


3.1  Estimation  of  Sensor  Model  Parameters 

The Maximum Likelihood (ML) method is a procedure for finding the value of one or more
parameters that maximizes the likelihood of a given set of statistical data under a known
distribution. If a Gaussian distribution is considered, the distribution representing the sensor
model is given by:

$$ p_{D_i}(z_i \mid \Theta, x_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(z_i - x_i)^2}{2\sigma^2} \right) \tag{5} $$

where the event D_i represents the data (z_i, x_i), and Θ = σ is the parameter to be estimated. The
likelihood function is the joint probability of the data (assuming independent data points), given by:

$$ L(\Theta) = \prod_{i} p_{D_i}(z_i \mid \Theta, x_i) \tag{6} $$

and the parameter σ can be estimated via the ML method by maximizing L(Θ) given by (6).
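For a Gaussian sensor model with a single constant σ, maximizing the joint likelihood has a well-known closed form: the ML estimate of σ is the root mean square of the residuals z_i - x_i. A minimal sketch follows; the function names and the four data points are hypothetical.

```python
import numpy as np

def log_likelihood(sigma, z, x):
    """Logarithm of the joint Gaussian likelihood of the residuals z - x."""
    z, x = np.asarray(z, dtype=float), np.asarray(x, dtype=float)
    return float(np.sum(-np.log(sigma) - 0.5 * np.log(2.0 * np.pi)
                        - (z - x) ** 2 / (2.0 * sigma ** 2)))

def ml_sigma(z, x):
    """Closed-form ML estimate: root mean square of the residuals."""
    z, x = np.asarray(z, dtype=float), np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean((z - x) ** 2)))

# Hypothetical calibration data: true positions and sensor readings (mm).
x_true = np.array([100.0, 150.0, 200.0, 250.0])
z_meas = np.array([101.0, 148.0, 203.0, 249.0])
sigma_hat = ml_sigma(z_meas, x_true)
```

Working with the logarithm of the likelihood leaves the maximizer unchanged and avoids numerical underflow when many data points are multiplied together.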

Most of the previous research on sensor fusion was based on the development of rigid sensor
models. In practice, the performance of sensors or any source of information depends upon
several factors, for example the environmental conditions under which the measurements were
made, and the performance of the estimation/calibration algorithm. Establishing the dependence of
a sensor’s performance on various parameters of the environment and on other signal/feature
extraction algorithms is not a trivial task. Statistical techniques such as correlation analysis
can be used to determine the manner in which these factors affect the sensor’s output. Selecting
the factors that can possibly affect the sensor output is difficult, and is mostly based on
heuristics. Many feature extraction algorithms include a goodness-of-fit function that can be
investigated to observe its correlation with the sensor output.

After  the  factor  which  affects  the  sensor’s  performance  has  been  selected,  the  next  challenge  is 
to  establish  the  correspondence  between  the  factor  and  the  uncertainty  in  the  sensor’s  output. 
Statistical  system  identification,  regression  analysis,  or  any  mapping  algorithm  can  be 
investigated  to  establish  the  correspondence.  It  might  be  difficult,  if  not  impossible,  to  obtain  the 
mathematical  relation,  and  in  the  absence  of  such  mathematical  relation,  model  based  statistical 
approach  would  be  difficult  to  use.  In  this  research  project,  the  universal  approximation 
capabilities  of  neural  networks  have  been  used  to  establish  this  correspondence. 


3.2  Neural  Network  based  Sensor  Models 

A neural network (NN) [16-17] is an information-processing paradigm inspired by the way in
which the heavily interconnected, parallel structure of the human brain processes information.
NNs are often effective for solving complex problems that do not have an analytical solution or
for which an analytical solution is too difficult to find. Currently, they are being applied to
many real world problems [18]. Three-layered NNs (i.e., one input layer, one hidden layer and one
output layer), with a hidden layer having sufficient nodes and a sigmoid transfer function, and a
linear transfer function in the input and output layers [19-21], are considered to be universal
approximators. In this research project, a three-layered NN has been formulated to obtain the
correspondence between the standard deviation σ of the Gaussian distribution and the parameter
which affects the sensor’s performance. The output of a typical three-layered NN is given by:

$$ y_i = f\left( \sum_{j=1}^{l} w_{2ij}\, f\left( \sum_{k=1}^{n} w_{1jk}\, x_k + b_{1j} \right) + b_{2i} \right), \quad i = 1, \ldots, m \tag{7} $$

where l is the total number of nodes in the hidden layer, n is the total number of inputs, m is the
total number of outputs, w is the weight and b is the bias of the network. The function f(·) is the
activation function associated with the nodes. The input to the neural network is the parameter
vector h, and the output is σ, the standard deviation of the Gaussian distribution. Hence,

$$ \sigma = f\left( \sum_{j=1}^{l} w_{2ij}\, f\left( \sum_{k=1}^{n} w_{1jk}\, h_k + b_{1j} \right) + b_{2i} \right) \tag{8} $$

or

$$ \sigma = \mathrm{NNET}(I, W, B) \tag{9} $$


where I is the vector representing the input parameters, W is the weight matrix, and B is the bias
matrix. Backpropagation (BP), based on the gradient descent technique, is one of the most popular
methods for training neural networks; it adjusts the weights based on the errors between the
actual and target output signals. For the neural network considered in this research, however, the
target data for σ is unknown, and cannot be obtained directly from experiments. Here, the neural
network is trained in a novel manner using the signal obtained from the Maximum Likelihood
parameter estimation approach. The likelihood function that needs to be maximized is given by (6),
in which the parameter σ is represented by the neural network function given by (8) or (9). Hence,
the likelihood function that needs to be maximized by choosing appropriate weights and biases of
the neural network is given by:


$$ L(W, B) = \prod_{i} \frac{1}{\sqrt{2\pi}\, \mathrm{NNET}(I_i, W, B)} \exp\left( -\frac{(z_i - x_i)^2}{2\, [\mathrm{NNET}(I_i, W, B)]^2} \right) \tag{10} $$


The weights and biases can be found by the gradient descent method or via evolutionary
strategies [22]. The above method has been used to obtain models of the infra-red proximity sensor
and of the vision sensors in stereo configuration.
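The idea of training the network directly on the likelihood, without target values for σ, can be sketched as follows. This is an illustrative sketch under several assumptions and not the training procedure used in this work: the input is a single scalar factor h, an exponential output unit is used to keep σ positive, the optimizer is a simple accept-if-better random search standing in for an evolutionary strategy, and the data are synthetic. All names are hypothetical.

```python
import numpy as np

def nn_sigma(h, params):
    """Three-layer network mapping a scalar factor h to a positive sigma."""
    n_hidden = (params.size - 1) // 3
    w1 = params[:n_hidden]
    b1 = params[n_hidden:2 * n_hidden]
    w2 = params[2 * n_hidden:3 * n_hidden]
    b2 = params[-1]
    hidden = 1.0 / (1.0 + np.exp(-(np.outer(h, w1) + b1)))  # sigmoid hidden layer
    return np.exp(hidden @ w2 + b2)  # exp output keeps sigma > 0 (assumption of this sketch)

def neg_log_likelihood(params, h, z, x):
    """Negative logarithm of the likelihood with sigma supplied by the network."""
    sigma = nn_sigma(h, params)
    return float(np.sum(np.log(sigma) + 0.5 * np.log(2.0 * np.pi)
                        + (z - x) ** 2 / (2.0 * sigma ** 2)))

def train(h, z, x, n_hidden=5, iterations=3000, step=0.1, seed=0):
    """Random-search hill climbing on the negative log-likelihood."""
    rng = np.random.default_rng(seed)
    params = np.concatenate([rng.standard_normal(2 * n_hidden), np.zeros(n_hidden + 1)])
    params[-1] = 0.5 * np.log(np.mean((z - x) ** 2))  # start at the constant (rigid) ML sigma
    best_cost = neg_log_likelihood(params, h, z, x)
    for _ in range(iterations):
        candidate = params + step * rng.standard_normal(params.size)
        cost = neg_log_likelihood(candidate, h, z, x)
        if cost < best_cost:  # keep a candidate only if the likelihood improves
            params, best_cost = candidate, cost
    return params, best_cost

# Synthetic data: the noise level shrinks as the factor h grows (hypothetical).
rng = np.random.default_rng(1)
h = rng.uniform(0.0, 1.0, 300)
x = rng.uniform(0.0, 100.0, 300)
z = x + (3.0 - 2.5 * h) * rng.standard_normal(300)

rigid_sigma = np.sqrt(np.mean((z - x) ** 2))
rigid_cost = float(np.sum(np.log(rigid_sigma) + 0.5 * np.log(2.0 * np.pi)
                          + (z - x) ** 2 / (2.0 * rigid_sigma ** 2)))
params, nn_cost = train(h, z, x)
```

Because the search starts from the constant ML value of σ and only accepts improvements, the trained model can never score worse than the rigid model on the training data; any gain comes from letting σ vary with h.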


3.3  Modeling  of  Stereo  Vision  Sensors 

One  of  the  most  important  components  of  stereo  vision  algorithm  is  stereo  matching  [23] 
which  involves  finding  out  the  location  of  the  point  in  right  image  plane  corresponding  to  a  point 
in  the  left  image  plane.  The  relative  displacement  of  these  two  points,  called  disparity,  is  used  to 
estimate  the  three-dimensional  position  of  the  point.  The  accuracy  with  which  stereo  vision 
sensors  would  be  able  to  specify  three-dimensional  positional  information  about  a  point  depends 
on  how  precisely  the  stereo  vision  algorithm  is  able  to  find  the  match  of  the  point.  The 
correlation  score  [24]  of  the  matched  points,  which  measures  the  correlation  between  two 
template  windows  from  left  and  right  images,  is  a  measure  of  “goodness-of-match”  of  the  two 
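points. This score for a template of size (2n+1) × (2m+1) is given by:

$$ \mathrm{Score}(P_L, P_R) = \frac{\sum_{i=-n}^{n} \sum_{j=-m}^{m} \left[ I_L(u_L + i, v_L + j) - \bar{I}_L(u_L, v_L) \right] \left[ I_R(u_R + i, v_R + j) - \bar{I}_R(u_R, v_R) \right]}{(2n+1)(2m+1) \sqrt{\sigma^2(I_L) \times \sigma^2(I_R)}} \tag{11} $$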


where I_L is the intensity matrix of the left image, I_R is the intensity matrix of the right
image, Ī_k(u_k, v_k) (k = L, R) is the average value of intensity, and σ(I_k) is the standard
deviation of image I_k, both taken over the (2n+1) × (2m+1) neighborhood of (u_k, v_k). The score
ranges from -1 to +1, -1 representing not
similar  at  all,  and  +1  representing  most  similar.  The  method  formulated  in  the  previous  section 
has  been  used  to  develop  a  model  for  the  stereo  vision  sensors  that  could  take  into  account  the 
performance  of  the  stereo  matching  algorithm. 
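
The correlation score in (11) can be computed directly from two image windows. The sketch below is illustrative only: it assumes grayscale images stored as 2-D arrays, points given as (row, column) indices, windows that lie fully inside both images, and a score of 0 for a flat window (where the score is undefined). The function name is hypothetical.

```python
import numpy as np

def correlation_score(left, right, p_left, p_right, n, m):
    """Score between (2n+1) x (2m+1) windows centred at p_left and p_right.

    The mean and standard deviation are taken over each window,
    using the population form (divide by the number of pixels).
    """
    (ul, vl), (ur, vr) = p_left, p_right
    win_l = np.asarray(left)[ul - n:ul + n + 1, vl - m:vl + m + 1].astype(float)
    win_r = np.asarray(right)[ur - n:ur + n + 1, vr - m:vr + m + 1].astype(float)
    denom = win_l.size * np.sqrt(win_l.var() * win_r.var())
    if denom == 0.0:
        return 0.0  # flat window: score undefined, 0 returned by convention (assumption)
    return float(np.sum((win_l - win_l.mean()) * (win_r - win_r.mean())) / denom)

# Small synthetic check: a window matched against itself and against its negative.
image = np.arange(49).reshape(7, 7)
same = correlation_score(image, image, (3, 3), (3, 3), n=1, m=1)
opposite = correlation_score(image, -image, (3, 3), (3, 3), n=1, m=1)
```

Subtracting the window means and dividing by the standard deviations makes the score insensitive to brightness offsets and contrast scaling between the two cameras.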

An experiment was carried out in the RAMA Laboratory, wherein a set of fifty data points was
collected. Each data point consisted of the 3-D location of a point in the world coordinate system
obtained via the stereo vision sensors (via the transformation discussed in reference [23]), the
correlation score for that point (given by (11)), and the actual 3-D location of the point in the
world coordinate frame. The correlation between the correlation score of the stereo match of two
image points and the error associated with that point in 3-D coordinates was found to be -0.3780,
-0.2131, and -0.2856 in the X, Y, and Z directions respectively. The error is the absolute error
between the actual 3-D location of a point and that obtained from stereo vision. A negative
correlation value indicates that when the correlation score is large, the error tends to be small,
which follows from the fact that a larger correlation score means a better stereo match and a
better estimate of the 3-D positional information.
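
The kind of correlation analysis described above can be sketched as follows, assuming the Pearson correlation coefficient is the measure intended (the text does not say which coefficient was used). The scores and errors below are made-up values chosen only to show the call, not experimental data.

```python
import numpy as np

def score_error_correlation(scores, abs_errors):
    """Pearson correlation between match scores and absolute position errors."""
    return float(np.corrcoef(scores, abs_errors)[0, 1])

# Hypothetical data for one coordinate direction.
scores = np.array([0.5, 0.6, 0.7, 0.8, 0.9])
abs_errors = np.array([5.0, 4.0, 3.0, 2.0, 1.0])   # mm
rho = score_error_correlation(scores, abs_errors)
```

A value near -1 means the error falls almost linearly as the score rises; the values reported above, between -0.21 and -0.38, indicate a much weaker trend.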

The  strategy  described  in  the  previous  section  was  used  to  develop  a  Gaussian  model  of  the 
sensor.  In  this  model  the  standard  deviation  of  the  distribution,  which  represents  the  uncertainty 
of  the  data,  is  dependent  on  the  correlation  score  for  the  specific  point.  This  dependence  was 
modeled  with  the  help  of  a  neural  network  with  five  nodes  in  the  hidden  layer.  This  neural 
network  takes  correlation  score  as  input,  and  outputs  the  value  of  standard  deviation  (sigma)  for 
that particular correlation score. To reduce the risk of convergence to a local optimum, the
neural network was trained via a Genetic Algorithm. The sensor model obtained from this approach
showed the intuitive trend that as the correlation score increases, i.e., as the stereo match gets
better, the standard deviation decreases. A smaller standard deviation implies that the positional
information obtained from stereo vision is less uncertain, and hence the degree of belief in the
sensor output is higher.

In  order  to  investigate  if  anything  was  gained  by  carrying  out  the  modeling  with  the  help  of 
neural  network  trained  by  maximum  likelihood  signal,  the  standard  deviation  was  obtained  by 
maximizing  the  likelihood  function  given  by  (6).  This  provided  a  constant  value  of  standard 
deviation  (sigma),  representing  rigid  sensor  model,  for  X,  Y,  and  the  Z  directions  for  the  same 
set  of  data.  The  NN  based  modeling  that  incorporated  correlation  score  of  stereo  matching  was 
able  to  improve  the  likelihood  function  by  a  factor  1.213,  1.1021,  and  1.1288  in  X,  Y  and  Z 
directions respectively as compared to the rigid sensor modeling approach. Hence, the NN based
modeling method fits the measured data better in the maximum likelihood sense, and promises more
accurate results.

3.4  Modeling  of  Proximity  Sensors 

The  output  of  the  IR  proximity  sensor  is  an  analog  voltage  which  is  indicative  of  the  distance 
of the object detected by the sensor. In order to calibrate the IR sensor readings, 750 data
points, comprising sensor output values in volts and the actual distance to the sensed object in
mm, were taken. The calibration was obtained via a neural network which accepted the sensor output
in volts as input and provided the distance in mm as output. The neural network was trained on the
750 data points obtained, and tested on a separate set consisting of 700 data points. Fig. 2
shows the plot of the neural network calibration and the test data. The plot shows the sensor
output (which is the input to the neural network) along the Y axis, versus distance (which is the
output of the neural network) along the X axis.


Fig.  2.  Neural  Network  Calibration  of  IR  Proximity  Sensor 

Once  the  calibration  of  the  IR  sensor  was  achieved,  the  next  step  was  to  develop  the 
probabilistic  model  of  the  sensor  based  on  the  theory  presented  in  the  previous  section.  In  order 
to develop the sensor model, the most important aspect is to determine the factors that can
possibly  affect  the  performance  of  the  sensor.  From  Fig.  2,  where  round  dots  show  the  test  data 
obtained  from  experiments,  it  can  be  easily  seen  that  the  uncertainty  in  data  increases  as  the 
distance  to  the  object  increases.  This  can  be  observed  from  the  horizontal  spread  in  the  data  as 
the  distance  to  object  increases.  A  larger  horizontal  spread  means  that  for  the  same  value  of 
sensor  output,  the  discrepancy  in  the  actual  distance  to  the  object  is  more.  In  this  research 
project,  the  model  of  the  IR  sensor  attempts  to  capture  this  dependence  of  uncertainty  on  the 
distance  to  the  object.  Since  the  output  of  the  sensor  is  indicative  of  the  distance,  this 
investigation  makes  use  of  neural  network  technique  outlined  in  previous  section  to  capture  the 
relationship  between  sensor’s  uncertainties  and  sensor  output. 

In  the  laboratory  experiments,  the  Infra-Red  sensor  was  mounted  on  the  wrist  of  the  robot  so 
that  it  looked  vertically  down  (negative  Z  direction  in  world  coordinate  frame).  The  IR  sensor 
provided information about the distance to the nearest object detected directly in front of the
sensor. Information about the position of the end effector of the robot was obtained from the
encoders of the robot. Hence, the IR sensor can be effectively used in conjunction with the robot
encoders to provide 3-D information about any object. Similar to the case of the vision sensors,
the model of the IR sensor has been obtained in all three X, Y, and Z directions. The correlation
between the sensor output for a point and the error associated with that point was found to be
-0.3078, -0.3211, and -0.2744 in the X, Y, and Z directions respectively. The error is the
absolute error between the actual 3-D location of a point and that obtained from the infra-red
sensor. A negative correlation value indicates that when the sensor reading is smaller (i.e., the
distance to the detected object is larger), the error is larger, which follows from the large
horizontal spread in Fig. 2 when the sensor reading is smaller. The neural network based modeling
technique was used to obtain the sensor model, which represents the dependence of the sensor’s
performance and inherent uncertainty on the distance to the detected object. The standard
deviation of the Gaussian sensor model obtained from this approach decreased as the sensor output
increased, which implies that when the distance to the object decreases (i.e., the sensor’s output
is larger) the standard deviation becomes smaller, and the sensor’s measurement becomes less
uncertain. This is also confirmed by Fig. 2 as well as by the negative correlation values obtained
above.

An analysis, similar to the one performed for the vision sensors, was carried out to study the
advantage of the proposed approach over the constant standard deviation (rigid sensor model)
obtained by maximizing (6). The proposed NN based modeling was able to improve the
likelihood function by a factor of 3.6690, 27.3505, and 2.3414 × 10^4 in the X, Y, and Z directions
respectively as compared to that obtained via rigid sensor modeling. As in the case of stereo
vision, the larger values of the likelihood function for the neural network based model indicate
that the proposed sensor modeling technique fits the measured data better than the rigid model.


4.  Sensor  fusion  in  occupancy  grids 

The occupancy grid [25-28] is a multi-dimensional field (usually of dimension two or three)
where each cell (or unit of the grid) stores or represents the probabilistic estimate of the state
of spatial occupancy. Occupancy grids are one of the most common low-level models of an
environment, and provide an excellent framework for robust fusion of uncertain and noisy
data. If the state variable (occupancy, in this case) associated with a cell C_i is denoted by
s(C_i), then the occupancy probability P[s(C_i)] represents the probabilistic estimate of occupancy
of that particular cell. If P[s(C_i) = occ] ≈ 0, then the cell is assumed to be empty, while if
P[s(C_i) = occ] ≈ 1, then the cell is assumed to be occupied.


If  a  single  sensor  is  used  to  obtain  the  occupancy  grid,  Bayes’  Theorem  can  be  used  in  the 
following  manner  to  determine  the  state  of  the  cell: 


$$ P[s(C_i) = \mathrm{occ} \mid z] = \frac{p[z \mid s(C_i) = \mathrm{occ}]\, P[s(C_i) = \mathrm{occ}]}{\sum_{s(C_i)} p[z \mid s(C_i)]\, P[s(C_i)]} \tag{12} $$


where z is the sensor measurement. The probability density function (p.d.f.) p[z | s(C_i) = occ] is
dependent on the sensor characteristics and is called the sensor model. The probability
P[s(C_i) = occ] is called the prior probability mass function and specifies the information made
available prior to any observation.

The sensor models obtained in Section 3 represent the probability of a sensor providing the
position z when the actual position of the object is x. These models relate the probability of
occurrence of an object to the distance from the measurement (in one dimension), given that the
object was detected by the sensor. These models (likelihood functions), along with any prior
information, are used to obtain the posterior distribution p(x | z), which represents the
uncertainty in the positional information supplied by the sensors. In this research, an occupancy
profile has been recreated with the help of three-dimensional occupancy grids. Each cell of the
grid represents a three-dimensional region in the world coordinate system. The occupancy profile
of the workspace is obtained by determining whether the cells of the grid are occupied or not. A
cell is said to be occupied if the spatial region represented by this cell has at least one point
(detected by the sensors) within it. Fig. 3 shows a situation in which two points contribute to
the probability of a cell (in one dimension) being occupied. The probability that point A lies
within the cell C_i, represented by the interval from x_{i-1} to x_i, can be obtained from the
equation:


$$ P(A \in C_i) = \int_{x_{i-1}}^{x_i} p(x \mid z_A)\, dx \tag{13} $$

where z_A is the sensor reading, and p(x | z_A) is the posterior distribution. Similarly, for point B:

$$ P(B \in C_i) = \int_{x_{i-1}}^{x_i} p(x \mid z_B)\, dx \tag{14} $$

The probability that either A or B is in the cell C_i is given by:

$$ P[(A \in C_i) \cup (B \in C_i)] = P(A \in C_i) + P(B \in C_i) - P(A \in C_i)\, P(B \in C_i) \tag{15} $$


The above equation assumes that the detections of points A and B are independent events. The
same approach can be used to determine the probability of a cell being occupied by at least one
point among any number of points detected in its vicinity, and is used to obtain the individual
occupancy grids from the IR proximity sensor and stereo vision.

Fig. 3. Two Points Contributing to Occupancy of a Cell
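
Equations (13) to (15) can be sketched in one dimension as follows. The sketch assumes a Gaussian posterior centred on each reading (which holds for a Gaussian sensor model with a flat prior) and independent detections; the function names and the example readings are hypothetical.

```python
import math

def gaussian_cdf(x, mean, sigma):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))

def prob_point_in_cell(x_lo, x_hi, z, sigma):
    """Integral of the Gaussian posterior p(x | z) over the cell [x_lo, x_hi], as in (13)."""
    return gaussian_cdf(x_hi, z, sigma) - gaussian_cdf(x_lo, z, sigma)

def prob_cell_occupied(x_lo, x_hi, readings):
    """Probability that at least one detected point lies in the cell.

    readings is a list of (z, sigma) pairs; detections are assumed independent,
    so the cell is empty only if every point lies outside it.
    """
    p_empty = 1.0
    for z, sigma in readings:
        p_empty *= 1.0 - prob_point_in_cell(x_lo, x_hi, z, sigma)
    return 1.0 - p_empty

# Hypothetical 5 mm cell with two nearby detections A and B.
p_a = prob_point_in_cell(10.0, 15.0, z=12.0, sigma=2.0)
p_b = prob_point_in_cell(10.0, 15.0, z=16.0, sigma=3.0)
p_cell = prob_cell_occupied(10.0, 15.0, [(12.0, 2.0), (16.0, 3.0)])
```

For two points the complement form used here reduces to the expression in (15), and it extends to any number of detections without expanding the inclusion-exclusion terms.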


4.1  Fusion  of  Two  Occupancy  Grids 

The  fusion  of  grids  obtained  from  individual  sensors  is  carried  out  by  determining  probabilistic 
estimates  at  the  cellular  level.  Most  of  the  previous  research  work  makes  use  of  probabilistic 
evidence  combination  formula  based  on  Independent  Opinion  Pool  [29]  and  use  of  maximum 
entropy priors [25, 27]. This method is applied at each cell to fuse the two probabilistic
estimates, P_1 and P_2, associated with the two grids obtained from the two sensors. The
mathematical expression representing the fusion is given by:


$$ P[s(C_i) = \mathrm{occ} \mid P_1, P_2] = \frac{P_1 P_2}{P_1 P_2 + (1 - P_1)(1 - P_2)} \tag{16} $$
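
A short sketch of the Independent Opinion Pool formula in (16) makes its behavior easy to inspect. The function name is hypothetical, and raising an error for the undefined case is a choice made for this sketch.

```python
def independent_opinion_pool(p1, p2):
    """Fuse two occupancy probabilities for one cell using (16)."""
    numerator = p1 * p2
    denominator = numerator + (1.0 - p1) * (1.0 - p2)
    if denominator == 0.0:
        # One estimate is exactly 0 and the other exactly 1: (16) is undefined.
        raise ValueError("completely contradictory estimates cannot be fused")
    return numerator / denominator

agreeing = independent_opinion_pool(0.8, 0.8)      # both sensors lean towards occupied
conflicting = independent_opinion_pool(0.9, 0.1)   # sensors disagree symmetrically
```

Two agreeing estimates reinforce each other, while symmetric disagreement collapses to 0.5 regardless of which sensor is more trustworthy, since the formula has no term for sensor reliability.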


There are certain problems associated with the use of this equation to fuse two estimates. For
example, this equation fails to provide a reasonable estimate if the two sensors have completely
contradictory measurements. Moreover, this equation puts an equal amount of belief in the two
measurements, and would provide incorrect estimates if the reliability or accuracy of the two
sensors being fused differs by a large amount. In this research, for example, stereo vision
sensors are fused with an infra-red sensor. The stereo vision sensors provide very accurate
measurements in the X direction, while the infra-red sensor does not. It would be unreasonable to
give equal weight to the probabilistic estimates obtained from these two sensors. In this
research project, the two estimates are fused in a Bayesian framework taking into account the
certainty and reliability of the sensors. As described in the previous sections, the sensor models
are given by the following Gaussian likelihood function:


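$$ p(z_k \mid x) = \frac{1}{\sigma_k \sqrt{2\pi}} \exp\left( -\frac{(x - z_k)^2}{2\sigma_k^2} \right), \quad k = 1, 2 $$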


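where k = 1 represents stereo vision, and k = 2 represents the IR sensor.
Then, from Bayes’ Theorem the fused MAP estimate is given by:

$$ \hat{x}_{MAP} = \arg\max_{x} \left[ p(z_1 \mid x)\, p(z_2 \mid x) \right] \tag{17} $$

or

$$ \hat{x}_{MAP} = \arg\max_{x} \left[ \frac{1}{2\pi \sigma_1 \sigma_2} \exp\left( -\frac{(x - z_1)^2}{2\sigma_1^2} - \frac{(x - z_2)^2}{2\sigma_2^2} \right) \right] \tag{18} $$

which gives:

$$ \hat{x}_{MAP} = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\, z_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\, z_2 = \frac{1}{r^2 + 1}\, z_1 + \frac{r^2}{1 + r^2}\, z_2 \tag{19} $$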


where r = σ_1/σ_2 is the ratio of the standard deviations. Hence, if there is no prior information
available about the quantity to be estimated, the Bayesian approach for fusion of the two sensor
estimates results in a weighted average dictated by the ratio of the standard deviations. If two
Gaussian distributions (given by the p.d.f.s of the two sensor models) are fused, then the
posterior distribution is also Gaussian, with a mean given by (19) and a standard deviation σ
given by:

$$ \sigma^{2} = \left[ \sigma_1^{-2} + \sigma_2^{-2} \right]^{-1} \tag{20} $$


Fig.  4  shows  the  two  distributions  that  get  fused  to  give  the  posterior  distribution.  The  simple 
product  of  distribution  is  also  shown  in  the  figure.  It  may  be  noted  from  the  figure  that  the 
standard  deviation  of  fused  distribution  is  smaller  representing  lesser  uncertainty  in  fused 
estimates. 

Fig. 4. Fusion of Two Gaussian Distributions
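
The weighted average in (19) and the fused standard deviation in (20) can be checked with a few lines of code. This is a minimal sketch with a hypothetical function name and made-up readings.

```python
def fuse_gaussian(z1, sigma1, z2, sigma2):
    """Fuse two Gaussian sensor estimates with no prior information.

    Returns the MAP estimate from (19) and the fused standard deviation from (20).
    """
    r_squared = (sigma1 / sigma2) ** 2
    x_map = z1 / (r_squared + 1.0) + r_squared * z2 / (1.0 + r_squared)
    sigma_fused = (sigma1 ** -2 + sigma2 ** -2) ** -0.5
    return x_map, sigma_fused

# Hypothetical readings (mm): sensor 1 is twice as precise as sensor 2.
x_map, sigma_fused = fuse_gaussian(z1=48.0, sigma1=2.0, z2=52.0, sigma2=4.0)
```

With these numbers r = 0.5, so the more precise sensor receives a weight of 0.8 and the fused standard deviation is smaller than either individual one.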


5.  Experimental  results  and  discussion 

The  theories  developed  in  the  previous  sections  were  validated  with  the  help  of  experiments 
performed  in  the  Robotics  and  Manufacturing  Automation  (RAMA)  Laboratory  at  Duke 
University.  A  cylindrical  object  was  placed  on  the  work-table.  Fig.  5  shows  the  images  of  the 
work-table  obtained  from  the  stereo  cameras.  Fig.  6(a)  shows  the  actual  occupancy  grid  of  the 
workspace.  This  was  obtained  based  on  the  geometric  dimensions  of  the  object  and  its  location 
in the workspace. For the occupancy grid developed in this research, each cell is of size
5 mm × 5 mm × 5 mm.


Fig. 5. Images of the Worktable Obtained from the Left and the Right Cameras

Once the data points were obtained from the IR proximity sensor and stereo vision sensors, Equations (13) to (15) were applied to find the probabilistic estimates of the occupancy of the cells of the grid, and two separate occupancy grids were developed, one from the IR sensor and one from stereo vision. If the probability of a cell being occupied satisfied $P[s(C_i) = occ] > 0.5$, the cell was assumed to be occupied. The occupancy grid obtained from fusion of the two grids via the Bayesian approach described in the previous section is shown in Fig. 6(b).
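The thresholding rule above can be sketched in a few lines of Python. This is an illustration only: the representation of the grid as nested lists indexed `[layer][row][column]`, the function name `threshold_grid`, and the sample probabilities are all assumptions, not details taken from the experiments.

```python
def threshold_grid(prob_grid, threshold=0.5):
    """Convert per-cell occupancy probabilities into binary cell states.

    A cell is marked occupied (1) only if its probability strictly
    exceeds the threshold; otherwise it is marked empty (0).
    """
    binary = []
    for layer in prob_grid:
        binary_layer = []
        for row in layer:
            binary_layer.append([1 if p > threshold else 0 for p in row])
        binary.append(binary_layer)
    return binary


# Hypothetical 2 x 2 x 2 grid of occupancy probabilities.
probabilities = [
    [[0.9, 0.5], [0.2, 0.7]],
    [[0.1, 0.51], [0.49, 0.3]],
]
print(threshold_grid(probabilities))
```

Note that a cell with probability exactly 0.5 is treated as empty here, because the rule in the text uses a strict inequality.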

The occupancy grids obtained above were based on the sensor models derived from the proposed neural network based learning scheme. If the sensor models are assumed to be rigid (the standard deviation does not depend on any parameter and remains constant for all data points), a different set of occupancy grids is obtained. Fig. 6(c) shows the occupancy grid fused from these two grids. For comparison, a fused occupancy grid was also obtained from the independent opinion pool method given by (16), and this fused grid is shown in Fig. 6(d).


Fig.  6.  Occupancy  Grids  a)  Actual  Grid,  b)  Fused  Grid  (Proposed  NN  based  Approach),  c) 
Fused  Grid  (Rigid  Sensor  Model  Approach),  d)  Fused  Grid  (Independent  Opinion  Pool  Method) 
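For readers unfamiliar with the independent opinion pool, the following Python sketch shows its commonly used normalized-product form for a single binary cell. Equation (16) is not reproduced in this section, so the exact expression used in the report may differ; the function name and the sample probabilities are hypothetical.

```python
def independent_opinion_pool(p1, p2):
    """Combine two occupancy probabilities for one cell.

    Assumed form: the product of the two 'occupied' probabilities,
    normalized against the product of the two 'empty' probabilities.
    """
    occupied = p1 * p2
    empty = (1.0 - p1) * (1.0 - p2)
    total = occupied + empty
    if total == 0.0:
        # The sensors are in total contradiction (one says 1, the other 0).
        raise ValueError("opinions are fully contradictory")
    return occupied / total


# Both sensors agree the cell is likely occupied: evidence is reinforced.
print(independent_opinion_pool(0.8, 0.8))
# One sensor is much less confident: the fused evidence drops.
print(independent_opinion_pool(0.8, 0.3))
```

In this form both sources are weighted equally, regardless of how reliable each sensor actually is.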

To facilitate comparison of the performance of the fusion process under the different algorithms, a measure of error was formulated, given by the following equation:

$$\text{Error} = \sum_{C_i} \left| s(C_i)_{actual} - s(C_i)_{S,F} \right| \tag{29}$$

where $s(C_i)_{actual}$ is the actual state of the cell, and $s(C_i)_{S,F}$ is the state of the cell obtained from the sensor and/or fusion process. The state of the cell is either 1 (for occupied) or 0 (for empty). The values of this error associated with the occupancy grids obtained from raw sensor measurements are 1062 and 1279 for the IR proximity sensor and stereo vision, respectively. Table I provides the error values associated with the occupancy grids obtained from the fusion process described above. The table compares the error values obtained via the two approaches. The first approach is based on the proposed neural network oriented sensor modeling scheme, and the second approach is based on the rigid sensor model.
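Because every cell state is either 0 or 1, the error measure amounts to counting the cells whose estimated state differs from the actual state. A minimal Python sketch follows; the nested-list grid layout, the function names, and the toy grids are assumptions made for illustration and are not data from the experiments.

```python
def cells(grid):
    """Yield every cell state of a grid stored as [layer][row][column]."""
    for layer in grid:
        for row in layer:
            yield from row


def grid_error(actual, estimated):
    """Sum of |actual state - estimated state| over all cells."""
    return sum(abs(a - e) for a, e in zip(cells(actual), cells(estimated)))


# Hypothetical 2 x 2 x 2 grids: the estimate is wrong in two cells.
actual_grid = [[[1, 0], [0, 1]], [[0, 0], [1, 1]]]
estimated_grid = [[[1, 1], [0, 1]], [[0, 0], [0, 1]]]
print(grid_error(actual_grid, estimated_grid))
```

The sketch assumes both grids have identical dimensions; a false positive and a false negative each contribute 1 to the total.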
Table I

Error Associated with Occupancy Grids Obtained from Fusion Process

Approach                       IR Sensor    Stereo Vision    Fused
NN Based Modeling Approach     1070         1160             956
Rigid Sensor Model Approach    1070         1279             1214

From the figures as well as from the table of results, it is evident that the sensor modeling and fusion scheme presented in this research report has reduced the uncertainty inherent in the individual sensors. The fusion process derived from the proposed neural network based sensor modeling scheme reduced the error associated with the occupancy grid by over 25% with respect to raw stereo vision sensor measurements, and by approximately 10% with respect to raw IR proximity sensor measurements.

On the other hand, the results from fusion based on the rigid sensor model scheme are not particularly impressive. Although the occupancy grid obtained from this method showed a reduction in error of about 5% with respect to raw stereo vision sensor measurements, the fused grid showed an increase in error of over 14% with respect to raw IR proximity sensor measurements. The major reason for this poor performance is that the procedure assumes equal uncertainty (given by a constant standard deviation of the sensor model) for all data points. This assumption leads to erroneous calculation of the probabilistic estimates, as shown by the increase in the error value.

The error associated with the occupancy grid obtained from the independent opinion pool method was found to be 1063. Hence, it shows no improvement when compared to raw IR proximity sensor measurements, and shows an improvement of approximately 17% with respect to raw stereo vision sensor measurements. This method tends to yield high probabilistic evidence when both sensors are in agreement, and low probabilistic evidence when at least one sensor provides a lower value than the other. The independent opinion pool method places equal belief in the probabilistic evidence provided by the two sources. For the fusion problem investigated in this research, which involves sensors with a large difference in the reliability and accuracy of their measurements, this method is not suitable.
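The percentages quoted in this discussion follow directly from the reported error counts. The short Python sketch below recomputes them from the raw-sensor errors (1062 and 1279) and the fused-grid errors (956, 1214 and 1063) stated in the text; the helper name `percent_change` and the dictionary labels are illustrative only.

```python
def percent_change(reference, value):
    """Relative change of value with respect to reference, in percent.

    Negative results indicate a reduction in error.
    """
    return 100.0 * (value - reference) / reference


# Error counts as reported in the text.
raw_errors = {"IR proximity": 1062, "stereo vision": 1279}
fused_errors = {
    "NN based model": 956,
    "rigid model": 1214,
    "independent opinion pool": 1063,
}

for method, fused in fused_errors.items():
    for sensor, raw in raw_errors.items():
        change = percent_change(raw, fused)
        print(f"{method} vs raw {sensor}: {change:+.1f}%")
```

A negative value corresponds to an improvement over the raw sensor grid, and a positive value to a degradation, as in the case of the rigid model relative to the raw IR proximity sensor.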


6.  Conclusion 

This research project proposes a novel technique that utilizes the learning and optimizing capability of neural networks to obtain a sensor model that automatically learns some of the key statistical relations from the data. The research results present a method of obtaining the occupancy profile of the environment based on a three-dimensional occupancy grid framework, and a novel method for obtaining probabilistic estimates of the occupancy of cells from two different kinds of sensory sources. The fusion of information from the sensors has been carried out under a Bayesian framework, and its performance has been compared with that of other strategies. The fusion process carried out under the proposed neural network based sensor modeling strategy was seen to outperform both the fusion process based on the rigid sensor model and the fusion based on the independent opinion pool method.
8. Publications resulting from this project

•  Kumar,  M.,  Garg,  D.,  and  Zachery,  R.,  “Multi-Sensor  Fusion  Strategy  to  Obtain  3-D 
Occupancy Profile”, accepted for publication in the Proceedings of the 31st Annual
Conference  of  the  IEEE  Industrial  Electronics  Society  (IECON),  Raleigh,  NC,  November 
2005. 

•  Kumar,  M.,  Garg,  D.,  and  Zachery,  R.,  “Intelligent  Sensor  Modeling  and  Data  Fusion  via 
Neural  Network  and  Maximum  Likelihood  Estimation”,  accepted  for  publication  in  the 
Proceedings  of  the  ASME  International  Mechanical  Engineering  Congress  and  Exposition, 
Orlando,  FL,  November  2005. 

•  Kumar,  M.,  Garg,  D.,  and  Zachery,  R.,  “Intelligent  Sensor  Uncertainty  Modeling  Techniques 
and  Data  Fusion”,  submitted  to  IEEE  Transactions  on  Mechatronics,  2005. 